perm filename CACHE.FIG[AM,DBL] blob sn#435201 filedate 1979-04-24 generic text, type C, neo UTF8
COMMENT āŠ—   VALID 00003 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	STORAGE PRINCIPLES
C00010 00003	UPDATING PRINCIPLES
C00013 ENDMK
CāŠ—;
STORAGE PRINCIPLES
------------------

       When (and when {\it not} to)

		Every time you have to call a lower order function.
		You called a function and it took a long time to return.
		The value returned has not changed since the last time you called it.
		The amount of time to recompute the value would be very high.
			E.g., after a lucky accident, where a repeat might
			be very costly or even fail entirely to duplicate
			your recent success at getting a solution.
		If (freq of usage)x(avg cost to recompute)/(freq of change)
			is high, then it pays off.
		If the value is so critical that the cost-benefit criteria
			(eg the resource limits are enormous) always favor
			recomputation rather than chancing a blunder,
			Then don't cache.
		If the function evaluates as fast as the caching mechanism itself,
		  	or if space is too tight, don't cache.
		F is called with the same argument x very often.


      Why

		Temporal cost of recomputing is higher than spatial cost of storage.
		Related to ``When" rules above; namely: in those situations, it's
			cost-effective to store a cache.
		Analogy to utility of hardware caching.
		Ability to utilize cached values as usage data to induct upon.

      Where

		The value to be cached should have a precise storage place.
		One extreme is an assertional database, with {\it no} structure.
		We prefer to add slot and subslot structuring, with some
		domain-specific tinkering of those two sets (especially 
		the former, the set of slots.)
		Thus an extreme example of prime numbers should be stored
		on a slot called extreme-examples, on a schema called primes.
		At access time, we look in the precise spot(s) where X would
		be, and if it isn't there, we compute and cache it right there.
		At worst, there should be a precise {\it set} of places where X
		might be, a well-defined path to search down for X (e.g., 
		a conjecture about prime numbers would either be an Example of
		Conjectures, or a Conjecture of Primes; if not in either of those
		places.)

      How

		Before (re)computing F(x), the program presumably searched in the
			``right place'' (see ``Where'' above) to find a cache of
			F(x) if one existed. At that time, set a pointer to that
			place, and when the value is found, place it there.
		Called functions might suggest how to cache their value in higher
		    calling caches 
			E.g., ``my value changes often so compile my definition
			and cache that, but don't waste resources caching this value.''
		Cache should be transparent and discardable (should be able to throw
		    	them all away if space is needed).
		If there are multiple places to store it (e.g., you learn that
			X is a PARENT of Y, but PARENT is a virtual slot defined as
			MOTHER $\union$ FATHER; if the cache is to be discardable, then
			we must store X under either Y's MOTHER or FATHER slots.)
			Then here are some strategies one might use:
			  Store it redundantly in several (all possible) places
			  Pick one of the places at random and store it there
			  Look up information about each place, to decide
			  Pick one of the places accurately by reasoning, testing, etc.

      What

		Store the returned value in any case.
		Good policy to store a description of the {\it call} which led to
			this value being computed.  E.g., make sure you
			store the amount of resources allotted, the
			time of day and the relative place 
			in the computation (e.g., task number), whether
			the user was allowed to be queried, etc.
		Along with the call, store data about the actual path to a solution.
			This includes algorithms used, time it took to compute,
			which other caches were used, etc.
		It may be useful to store other intermediate results of the process
			that led to the value's being computed successfully
			(e.g., a trap that was found and may crop up again.)
			This also includes storing some abstract, partially-evaluated
			symbolic expressions which generalize the value and its
			computation.
		If possible, predict (criteria, time-limit, etc.) conditions under
			which this will no longer be current. E.g., ``Depends
			on the definition of Y being known, and on there being 
			no examples of Z known, and...'')
			Include a discussion of the certainty of the result.
		It may be worthwhile to {\it stack} the previous cached values,
		  	enabling the program to determine if (when, how, how often)
			they're changing.
		If you choose not to cache, you may wish to store a marker recording
			--- and hopefully explaining --- that decision.


      How to eliminate caches

		Just reverse the criteria in ``When" (above), to see if it
			is no longer cost-effective to keep the cache.
		Remember that F(x) will occupy varying amounts of space,
			varying with F, x, and time. 
			AM, e.g., eliminated largest cached values first.
		Eliminate least frequently (or least recently) used caches first.
		In general, eliminate the caches that are least cost-effective.


UPDATING PRINCIPLES
-------------------


       When

		If the current request would have produced a different answer.
		Usually, only do this if the current request will produce a
			distinctly {\it better} answer. For instance:
		If the current request specifies more precision in the answer.
		If the resource allocation is high enough to get a better
			or more current answer than the one cached already.
		Trivially:  If there is no cached answer available.
		Typically: x or the definition of F changed since F(x) was cached.


       Why

		Frame problem: if the value is old, it's hard to know for sure that
			the world hasn't changed since it was computed.
		If new value is computed (either intentionlly or accidentally),
			and its freshness (or the other conditions of its call)
			makes it ``better" than the existing cached value.

       How

		Discard the old value, or remember it elsewhere (maybe on disk).
		Can use demon traps that flag the cache as out of date, without
			bothering to recompute the value if not required to.
		If the new value is found, simply store it.
		The user may want to know about the updating.
		Many other cached values may depend upon this one, and they may have
			to be ferreted out and updated also, or at least some
			information should be left posted so they can later determine
			that they may be out of date.
		Usually: discard the whole old value, store just the new one.
		If the program is very smart: When the world changes, try to
			propogate that change through the network of caches,
			changing {\it them}. This is much better than erasing them!